1 ATGCTGGGGG CAGTGGAAGG CCCCAGGTGG AAGCAGGCGG AGGACATTAG 
51 AGACATCTAC GACTTCCGAG ATGTTCTGGG CACGATCAAG CACCCCAACA 
101 TTGTAGCCCT GGATGACATC TATGAGAGTG GGGGCCACCT CTACCTCATC 
151 ATGCAGCTGG TGTCGGGTGG GGAGCTCTTT GACCGTATTG TGGAAAAAGG 
201 CTTCTACACG GAGCGGGACG CCAGCCGCCT CATCTTCCAG GTGCTGGATG 
251 CTGTGAAATA CCTGCATGAC CTGGGCATTG TACACCGGGA TCTCAAGCCA 
301 GAGAATCTGC TGTACTACAG CCTGGATGAA GACTCCAAAA TCATGATCTC 
351 CGACTTTGGC CTCTCCAAGA TGGAGGACCC GGGCAGTGTG CTCTCCACCG 
401 CCTGTGGAAC TCCGGGATAC GTGGCCCCTG AAGTCCTGGC CCAGAAGCCC 
451 TACAGCAAGG CTGTGGATTG CTGGTCCATA GGTGTCATCG CCTACATCTT 
501 GCTCTGCGGT TACCCTCCCT TCTATGACGA GAATGATGCC AAACTCTTTG 
551 AACAGATTTT GAAGGCCGAG TACGAGTTTG ACTCTCCTTA CTGGGACGAC 
601 ATCTCTGACT CTGCCAAAGA TTTCATCCGG CACTTGATGG AGAAGGACCC 
651 AGAGAAAAGA TTCACCTGTG AGCAGGCCTT GCAGCACCCA TGGATTGCAG 
701 GAGATACAGC TCTAGATAAG AATATCCACC AGTCGGTGAG TGAGCAGATC 
751 AAGAAGAACT TTGCCAAGAG CAAGTGGAAG CAAGCCTTCA ATGCCACGGC 
801 TGTGGTGCGG CACATGAGGA AACTGCAGCT GGGCACCAGC CAGGAGGGGC 
851 AGGGGCAGAC GGCGAGCCAT GGGGAGCTGC TGACACCAGT GGCTGGGGGG 
901 CCGGCAGCTG GCTGTTGCTG TCGAGACTGC TGCGTGGAGC CGGGCACAGA 
951 ACTGTCCCCC ACACTGCCCC ACCAGCTCTA G (SEQ ID NO:1)

FEATURES: 

Start Codon: 1 
Stop Codon: 979 
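
The relationship between the transcript above and the encoded protein can be checked with a short translation routine. The following is a minimal sketch, assuming the standard genetic code and reading from the start codon at position 1; the function name `translate` and the variable names are illustrative and not part of the original listing.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a DNA string from its first base, stopping at a stop codon.

    Whitespace (as in the 10-base blocks of the listing) is ignored.
    """
    dna = "".join(dna.split()).upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# First row of SEQ ID NO:1 (positions 1-50), copied from the listing.
first_row = "ATGCTGGGGG CAGTGGAAGG CCCCAGGTGG AAGCAGGCGG AGGACATTAG"
print(translate(first_row))
```

Applied to the full 981-base transcript, the same routine would stop at the TAG codon that begins at position 979.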

Homologous proteins: 

TOP 10 BLAST Hits 

CRA|18000004983962 /altid=gi|4502553 /def=ref|NP_003647.1| (NM_.
CRA|18000004936440 /altid=gi|3122310 /def=sp|Q63450|KCC1_RAT CA.
CRA|223000002652742 /altid=gi|15928726 /def=gb|AAH14825.1|AAH14.
CRA|18000005144641 /altid=gi|3114436 /def=pdb|1A06| Calmodulin.
CRA|18000004932361 /altid=gi|406113 /def=gb|AAA19670.1| (L24907.
CRA|117000066864297 /altid=gi|9966875 /def=ref|NP_065130.1| (NM.
CRA|149000126089143 /altid=gi|14422219 /def=emb|CAC41379.1| (AL.
CRA|224000007378166 /altid=gi|16755792 /def=gb|AAL28100.1|AF428.
CRA|114000110934306 /altid=gi|14196445 /def=ref|NP_065172.1| (N.
CRA|18000005191499 /altid=gi|4007153 /def=emb|CAA19296.1| (AL02.

Blast hits to dbEST:
CRA Number             gi Number     Score             Expect
CRA|58000099505996     gi|12943070   1459 bits (736)   0.0
CRA|164000139365918    gi|12675371   1415 bits (714)   0.0
CRA|58000099322782     gi|12899184   1215 bits (613)   0.0
CRA|78000169264025     gi|14067900   1134 bits (572)   0.0
CRA|225000015220001    gi|18523306   1130 bits (570)   0.0
CRA|225000015219990    gi|18523305   1130 bits (570)   0.0
CRA|225000001633290    gi|15750044   1102 bits (556)   0.0
CRA|61000077034868     gi|14446412   1063 bits (536)   0.0
CRA|335000490524629    gi|8278341     995 bits (502)   0.0
CRA|146000055060127    gi|10205334    954 bits (481)   0.0
CRA|61000077035673     gi|14446457    890 bits (449)   0.0
CRA|1000488750278      gi|15128333    884 bits (446)   0.0
CRA|225000001678100    gi|15752845    882 bits (445)   0.0
CRA|222000012165952    gi|18781967    850 bits (429)   0.0
CRA|224000004550264    gi|15947485    718 bits (362)   0.0
CRA|162000005790240    gi|9185548     706 bits (356)   0.0
CRA|225000000831163    gi|15496148    317 bits (160)   6e-85

EXPRESSION INFORMATION FOR MODULATORY USE (library source):
gi Number      Organ   Tissue Type
gi|10205334    eye     retinoblastoma
gi|15496148    brain   hypothalamus

1 MLGAVEGPRW KQAEDIRDIY DFRDVLGTIK HPNIVALDDI YESGGHLYLI
51 MQLVSGGELF DRIVEKGFYT ERDASRLIFQ VLDAVKYLHD LGIVHRDLKP 
101 ENLLYYSLDE DSKIMISDFG LSKMEDPGSV LSTACGTPGY VAPEVLAQKP 
151 YSKAVDCWSI GVIAYILLCG YPPFYDENDA KLFEQILKAE YEFDSPYWDD 
201 ISDSAKDFIR HLMEKDPEKR FTCEQALQHP WIAGDTALDK NIHQSVSEQI 
251 KKNFAKSKWK QAFNATAVVR HMRKLQLGTS QEGQGQTASH GELLTPVAGG
301 PAAGCCCRDC CVEPGTELSP TLPHQL (SEQ ID NO: 2) 

FEATURES: 

Functional domains and key regions: 

Prosite results: 

PDOC00001 PS00001 ASN_GLYCOSYLATION
N-glycosylation site

        264-267 NATA

PDOC00004 PS00004 CAMP_PHOSPHO_SITE
cAMP- and cGMP-dependent protein kinase phosphorylation site

        219-222 KRFT

PDOC00005 PS00005 PKC_PHOSPHO_SITE
Protein kinase C phosphorylation site
Number of matches: 3
    1   28-30   TIK
    2   70-72   TER
    3   204-206 SAK

PDOC00006 PS00006 CK2_PHOSPHO_SITE
Casein kinase II phosphorylation site
Number of matches: 9
    1   55-58   SGGE
    2   70-73   TERD
    3   107-110 SLDE
    4   122-125 SKME
    5   204-207 SAKD
    6   236-239 TALD
    7   245-248 SVSE
    8   279-282 TSQE
    9   289-292 SHGE

PDOC00007 PS00007 TYR_PHOSPHO_SITE
Tyrosine kinase phosphorylation site

        62-69 RIVEKGFY

PDOC00008 PS00008 MYRISTYL
N-myristoylation site
Number of matches: 3
    1   128-133 GSVLST
    2   283-288 GQGQTA
    3   299-304 GGPAAG

PDOC00100 PS00108 PROTEIN_KINASE_ST
Serine/Threonine protein kinases active-site signature

        93-105 IVHRDLKPENLLY
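
The short-motif entries above can be reproduced with a simple pattern scan. The sketch below assumes the usual PROSITE consensus patterns for three of the entries (PS00001 `N-{P}-[ST]-{P}`, PS00005 `[ST]-x-[RK]`, PS00006 `[ST]-x(2)-[DE]`) and uses SEQ ID NO:2 as obtained by translating SEQ ID NO:1. The helper names `prosite_to_regex` and `scan` are illustrative, and the converter deliberately handles only simple elements (no `x(m,n)` ranges or anchors).

```python
import re

# SEQ ID NO:2 (326 residues), as translated from SEQ ID NO:1.
PROTEIN = (
    "MLGAVEGPRWKQAEDIRDIYDFRDVLGTIKHPNIVALDDIYESGGHLYLI"
    "MQLVSGGELFDRIVEKGFYTERDASRLIFQVLDAVKYLHDLGIVHRDLKP"
    "ENLLYYSLDEDSKIMISDFGLSKMEDPGSVLSTACGTPGYVAPEVLAQKP"
    "YSKAVDCWSIGVIAYILLCGYPPFYDENDAKLFEQILKAEYEFDSPYWDD"
    "ISDSAKDFIRHLMEKDPEKRFTCEQALQHPWIAGDTALDKNIHQSVSEQI"
    "KKNFAKSKWKQAFNATAVVRHMRKLQLGTSQEGQGQTASHGELLTPVAGG"
    "PAAGCCCRDCCVEPGTELSPTLPHQL"
)


def prosite_to_regex(pattern):
    """Convert a simple PROSITE pattern into a regular expression string."""
    parts = []
    for element in pattern.split("-"):
        core, count = re.fullmatch(r"(.+?)(?:\((\d+)\))?", element).groups()
        if core == "x":
            core = "."                      # any residue
        elif core.startswith("{"):
            core = "[^" + core[1:-1] + "]"  # excluded residues
        parts.append(core + ("{" + count + "}" if count else ""))
    return "".join(parts)


def scan(pattern, sequence):
    """Return (start, end, match) tuples, 1-based, overlapping hits included."""
    regex = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    hits = []
    for m in regex.finditer(sequence):
        start = m.start() + 1
        hits.append((start, start + len(m.group(1)) - 1, m.group(1)))
    return hits


for name, pattern in [
    ("ASN_GLYCOSYLATION", "N-{P}-[ST]-{P}"),
    ("PKC_PHOSPHO_SITE", "[ST]-x-[RK]"),
    ("CK2_PHOSPHO_SITE", "[ST]-x(2)-[DE]"),
]:
    print(name, scan(pattern, PROTEIN))
```

A lookahead is used so that overlapping occurrences are reported, which matters for short motifs such as the casein kinase II site.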
Membrane spanning structure and domains:

Helix Begin End Score Certainty 

1 127 147 0.883 Putative 

2 155 175 1.515 Certain 

3 288 308 0.649 Putative 

BLAST Alignment to Top Hit: 

>CRA|18000004983962 /altid=gi|4502553 /def=ref|NP_003647.1|
          (NM_003656) calcium/calmodulin-dependent protein kinase I
          [Homo sapiens] /org=Homo sapiens /taxon=9606 /div=PRI
          /dataset=nraa /length=370
          Length = 370

 Score = 661 bits (1688), Expect = 0.0
 Identities = 326/370 (88%), Positives = 326/370 (88%), Gaps = 44/370 (11%)
 Frame = +3

Query: 126  MLGAVEGPRWKQAEDIRDIYDFRDVLGT-------------------------------- 209
            MLGAVEGPRWKQAEDIRDIYDFRDVLGT
Sbjct: 1    MLGAVEGPRWKQAEDIRDIYDFRDVLGTGAFSEVILAEDKRTQKLVAIKCIAKEALEGKE 60

Query: 210  ------------IKHPNIVALDDIYESGGHLYLIMQLVSGGELFDRIVEKGFYTERDASR 353
                        IKHPNIVALDDIYESGGHLYLIMQLVSGGELFDRIVEKGFYTERDASR
Sbjct: 61   GSMENEIAVLHKIKHPNIVALDDIYESGGHLYLIMQLVSGGELFDRIVEKGFYTERDASR 120

Query: 354  LIFQVLDAVKYLHDLGIVHRDLKPENLLYYSLDEDSKIMISDFGLSKMEDPGSVLSTACG 533
            LIFQVLDAVKYLHDLGIVHRDLKPENLLYYSLDEDSKIMISDFGLSKMEDPGSVLSTACG
Sbjct: 121  LIFQVLDAVKYLHDLGIVHRDLKPENLLYYSLDEDSKIMISDFGLSKMEDPGSVLSTACG 180

Query: 534  TPGYVAPEVLAQKPYSKAVDCWSIGVIAYILLCGYPPFYDENDAKLFEQILKAEYEFDSP 713
            TPGYVAPEVLAQKPYSKAVDCWSIGVIAYILLCGYPPFYDENDAKLFEQILKAEYEFDSP
Sbjct: 181  TPGYVAPEVLAQKPYSKAVDCWSIGVIAYILLCGYPPFYDENDAKLFEQILKAEYEFDSP 240

Query: 714  YWDDISDSAKDFIRHLMEKDPEKRFTCEQALQHPWIAGDTALDKNIHQSVSEQIKKNFAK 893
            YWDDISDSAKDFIRHLMEKDPEKRFTCEQALQHPWIAGDTALDKNIHQSVSEQIKKNFAK
Sbjct: 241  YWDDISDSAKDFIRHLMEKDPEKRFTCEQALQHPWIAGDTALDKNIHQSVSEQIKKNFAK 300

Query: 894  SKWKQAFNATAVVRHMRKLQLGTSQEGQGQTASHGELLTPVAGGPAAGCCCRDCCVEPGT 1073
            SKWKQAFNATAVVRHMRKLQLGTSQEGQGQTASHGELLTPVAGGPAAGCCCRDCCVEPGT
Sbjct: 301  SKWKQAFNATAVVRHMRKLQLGTSQEGQGQTASHGELLTPVAGGPAAGCCCRDCCVEPGT 360

Query: 1074 ELSPTLPHQL 1103
            ELSPTLPHQL
Sbjct: 361  ELSPTLPHQL 370 (SEQ ID NO: 4)

Hmmer search results (Pfam):

Scores for sequence family classification (score includes all domains):
Model     Description                                    Score    E-value
PF00069   Eukaryotic protein kinase domain               271.6    1.1e-77
CE00022   CE00022 MAGUK_subfamily_d                      119.7    3.1e-35
CE00359   CE00359 bone_morphogenetic_protein_receptor      6.5    0.36
CE00031   CE00031 VEGFR                                    4.3    0.2
PF01496   V-type ATPase 116kDa subunit family              1.8    7.5
CE00292   CE00292 PTK_membrane_span                      -89.8    0.0011
CE00287   CE00287 PTK_Eph_orphan_receptor                -96.6    0.044
CE00291   CE00291 PTK_fgf_receptor                      -123.4    0.11
CE00286   CE00286 PTK_EGF_receptor                      -151.4    0.095
CE00290   CE00290 PTK_Trk_family                        -204.5    0.4
CE00016   CE00016 GSK_glycogen_synthase_kinase          -271.7    0.12

Parsed for domains:
Model     Domain  seq-f  seq-t      hmm-f  hmm-t        score   E-value
PF01496   1/1        85     95          1     11  [.      1.8       7.5
CE00359   1/1        93    146        272    327  ..      6.5      0.36
CE00031   1/1        79    163  ..   1053   1137          4.3       0.2
CE00290   1/1         2    212  ..      1    282       -204.5       0.4
CE00292   1/1         3    227  ..      1    288        -89.8    0.0011
CE00291   1/1         1    230          1    285       -123.4      0.11
CE00287   1/1         2    230          1    260        -96.6     0.044
CE00286   1/1         1    230          1    263       -151.4     0.095
CE00022   1/1        30    232         75    283        119.7   3.1e-35
PF00069   1/1        25    232         43    278  .]    271.6   1.1e-77
CE00016   1/1         1    302          1    433  []   -271.7      0.12

1 AAACCGACCT TTGGCCTCTT GCCTGCCGTC CTAGTTGCAG GCTCTCTCCC
51 CTAACCTGGA CCCCAGCCAT CAAACTCTGG AGCCCCGCCA GTCACGTGAC
101 ACCTCGGTCC TTTTTGGCCT GTTTCCTTCA GGATCCCGAT TTAACTTCCT
151 CCTCCCCAAT TCCCTCTGCC CCCAATACCT CTAGGCACCA CCACCCGCTC
201 TGAGGAGCAA GTGTCTGGGG CTGAAGCCTC AGCTCCATCT TGCAGAGGAA
251 CCGGGGCCTC AGTCTTCCCA CCTGTCAAGT GGGGCCCACA CCCTGCGACC
301 ACCTCCACTC TCTTCATTGC CTAGTCTTGC CCGGTCCTTC CCCACTCCCT
351 CACTCCCCCA TCCCCCACCA GACTCCCGTG CAGTTCCAGG GCCTGTTTCC
401 CTTCAGGGCA CGGAGAAGGG AGACAGAGCC CTAAGGGAGG TCGCAGAACT
451 GGTCTGAAAG AAAATCCACC AGGCCACAGG GTGAGTTTGG CCGGCCTCTA
501 GCTTCAGACA GACGGGGTTC GAATCCTGCT TTGCTTCCGA CCACCCGCTG
551 ATTTGAAAAT CATCTCTCCG GGCCTCAGTT GTCCCCTCTG TGAAATGGAC
601 CCCGCTTAAG ACCAAGGGCG GGAAGCGTCC AGCAGGAGAT CTCTGACCAG
651 AAGCAGGGAG ATGGCCTCCA CCCGTGCCCC TTCCCCAGCC TTGGAGCGGT
701 GCCTCGCCTC CCAATCCCGG GTCCCTCCGC CGCAGGCTCC ACCTCCACTG
751 ACATCAGAGC CGCAGGCGGG CGGAGAGAGC CGCCGAGCCG AGCCGAGCCC
801 CAGCTCCAGC AAGAGCGCGG GCGGGTGGCC CAGGCACGCA GCGGTGAGGA
851 CCGCGGCCAC AGCTCGGCGC CAACCACCGC GGGCCTCCCA GCCAGCCCCG
901 CGGCGGGGCA GCCGCAGGTA CAGCCGGGCC CCCCATCCCT GCACCCCTGG
951 GCGCTGCGTG GGGGCGGTGG GAGCCCCTAG CCTCTGGGTA TCCTTTCCCA
1001 AGGAGTGGCC ACTGGGCACT CTCCCGGGCG GGCTGGACCC TGAGGGGCAG
1051 GGCTGGGCCT TTCTCCACCT CTGTCCCAGG CCCAGCAGGT GCCAGGCGGG
1101 CCTATGGGAC ACTGAGTGGG TAATAGAGAA GGGGGCCTGT GTGAGCGCCT
1151 TCAGCTGGGC CTGACTGGAA GGGCGTGGGC ATTTGGAGGT ATCCATGGGG
1201 TGGGGGGGCT TGCGGAGTGT ACTGTCTTAG GACAGGCGTG TGGGTCAGAC
1251 ATGGGTGGAG GATCTGGGAA TCTGTGTGTT TTTTGTTCCA GAGGGGTGTC
1301 CACGTGTTTT GTGTGCTGGT ATTTGGCTCT CAGGGTCTTA AGTCAGAGTT
1351 AGGAGGGGGT GTACAATTGT GTACTGAGGA TGTTTGGAGT TAGGTGTGTA
1401 AGGACTTGGG GTTTGGTTTG GAATACAGGA GCTTCCAGGG GATGGGGTAG
1451 AGGAGCTGGA GGGTGTAGGG TACGTCTGGT ATATGAGGGT GTGTGTGTGT
1501 GTGTCTGGGT GTCATCTTGT GTGGGTGCGG GTGGATGTGT GTTTTGGGGT
1551 GTAAGAGGGA GCTGGGTGAG GGATGTTTGG ATGGACAGGC AGGTGTTCGG
1601 GTGCAGGGCT GTCTGGGGCA CTGTGTGGTG TGGACATGTG TGCTGATGTC
1651 TGGGAGTACA TGTATGATCA GGTGTCACGG GATGTGGATA CAAGGCGTAC
1701 TGGATCTGGG AGGCAGGTGT TTGAGTTCAG GGCTGTGGAG GGGGCTTGGT
1751 GTGGCATGTC TGCTACAGGG ATGTGTGTGG ATCTGTGAGG GTTGTATTTG
1801 GTAGGCCTCC ATGTGGGTTT CAGACTCTGC CTCTAGAGCT TACACTCGAG
1851 TCTCCTTTCC TAGAAGATTC TGCCCCTGGA TGGGTGGGCA GGGTCCCCTG
1901 GGAAAAAGGT CCTGTTCCAG GAGTGGAATC TCACACCAGA GGCCCTAGTC
1951 AGGGCACCTT CTCCTCATTC TCCCTTAGAG AAAAAGAGAG AAGGAAAGTG
2001 CTCTCCCTGA GGTCACAAAG CATGCTGGGC TCTGTTTTGG CCTCATCTGT
2051 GGATGGGTTG GGAGGCTGTG TTCTCTGAAT GGGGCCCATT CTGGCTTCAT
2101 ATTGGAAGTA CCAGCCAAGG CCATTCGATG GCCTTTGCCC TCAGCAAGCT
2151 TAGCTGGGGG CCCCAGGCCA GGTGTCATTA GGGCCTCTGG AGCCAGCCTC
2201 TACCTAACTC CAACCTCAGT CTCCCCATTC TTCATCTGAT AAATGGGAGA
2251 GAACTCCCAC CCTCTCCTGC TGGATGAGAC AGACCTCAGC AGAGGAAGGG
2301 CCAGGCTGGA TAGGGTTAGA TGGGGCCAGG AAGGGACAGA GTGAGCAGGA
2351 CCATTTCTCA TGCTCCCGGG ACCCAGATGG GGAGTCAGGA GGGAGAGGTC
2401 TGGGGAGCTC CAGCTGTGGC TGTTGTTGCT GTGGTAACAG TGCAGAAAGA
2451 GCTATTTAAA AATGTGGCTG AGATGTTGCT GGAAGCCCAG GCTGCTGGAA
2501 ACCTGATTTC GGAGAGGCCG GGGAGTCGGG GGAAGGAGGA GGGAAAGGAG
2551 ACACCCCAGC AATCCCCAGG GTGGGGCGGG GACATCACTG GTTCTGGGGA
2601 CAGGGGGATC CTCCAGGCTT CTACCAGCTG CTCTGGGGGT TTATCTGTTG
2651 TACTGCCAGA AGTCAGGGTT TCCCTAGGTG CTTGGATTTG GATAGGGGGA
2701 AAACTGGGAA GAGAACTAGA ATAAATGAAT GAATGAATGC ATGACTTTGT
2751 TAAATAAAGA ATTTTGCTGC CACTGTGAAA GGTTTTTCTC TAGGCATGAG 
2801 AATTTGCTGA ATGTTGAATA AACAAATGAA TGTTTGTTGA ATGATTTTGT
2851 CAAATGGATG AATCAAGGAT GAATAAATGC AGGTTGAATG ACTGAATGGG 
2901 GCCTGCAGTA AATTCCCAGA CAGAGGGCTG GGCTCTGCTG AGTCTCCTCC 
2951 TTCCATTCTC CTTACAGGAG CCCTGGCTGT GGTCGGGGGG CAGTGGGCCA 
3001 TGCTGGGGGC AGTGGAAGGC CCCAGGTGGA AGCAGGCGGA GGACATTAGA 
3051 GACATCTACG ACTTCCGAGA TGTTCTGGGC ACGTGAGTCC AGGGCAGGAT 
3101 TGGGTGCTGG ATGGCTGAGG GAGGCTGAGT CCAGGGTGGG GCTTCCTCTG 
3151 GTCAATTAAT GCTTCCTGTT TCCCACAGCC CAGGCCCTGT GGCAGCACTA 
3201 TCTAGGGCCT AAACTGTCCC CAGCTTTTCA CTTCTGGATG ACAGTGGGTG 
3251 GGACACGGGC TGCTCTCCCA ATAGCCCTGG GTTCTTGAAG AGAAAGAAGT 
3301 CGAGAGAATG AAGGTGCCAG TCAGTCCATT TAACTTGCTG CCAAGAGCTA 
3351 AGTGTTCTAG CCTAGGTTTG GGAACTGAGG CTGGAGATGG CTCTGTTCTT
3401 GGTGCTGGGA ATGCAGAAAT AACTCAAACC TGGTCTCTGC CCTTCAAGTT 
3451 GATCCCAGAC ATGTGCAAGA GACAGACCTA CAGAAAATGA CAACAGGGTG 
3501 TGTGCTGTGC TCCAATTAAG GTTGGGATTG AGGGCTTTGT GGAGCCCAGA 
3551 GAGAGCTGTG CCTTCTGCCT GGGGGAAAAC TTCCTGGAGA ATGGGGCATT 
3601 AGAGCTGGGG ACTGAAGGAT GGGTAGGTGT GCACTTGTCA GAGAGGAAGA 
3651 AGGACATTCC AGGCAGAAGG AATAGCATAA ACAAAGGCTT AGAGGCATGG 
3701 TTCTATGTGG AGAGAGGTAG AGTGTGATGG AGCTTAAAAT CACAGGCTGG 
3751 GGGGAGAGTG GAAAAAGGGG CTGGAGATGA AAGTGGGACA GTTTGTGTAG 
3801 GGTTTTGGAA GCCAGGCCAG GGAGGCTGGA TATTGTCCCA TAGGCCACCG 
3851 GGAGACACTT AAGACTTTTT GGCAGGTGTG CAATTCAGGA TAGTCACTCT
3901 GGCCACAGCT TGGAGGGTAA ATTGGAGAGG GACAAGACTG GAAACCAGTG 
3951 ATGAGGTTAC TACAGTAACT AATTATCCCT GAGGATTGAA ATTTCACCAC 
4001 GAGAGATGCT TTTCTTTGAC TTATGACTTC TTATTCTCCC AGAGAAAGCA 
4051 AACAGATGTG GAAAGAATAC CCTAGCAAAT CCTCTTTAAT CAGTTAACTT 
4101 TAGTTAAATG AGTTTATTTG TTCCTTTTTA AGAACCTGTT CTAAAACACT 
4151 GCTTCTTAAA GTTCAATGAG CATACAAATC ACCTGAGGAT TTTGTTAAAC
4201 TGCAGATTGA TTTAGTAAAT CTGGGGCAGG GCCTAAAGTT TTGCATTTCT 
4251 TTTTTTTTTC TTTTTTTTGA CCCAGGATCC AAAGCAGTAG AGATTTTGCA
4301 TTTCTAAAAA AGTTCCCGGG TGATGCTGAT GGTTCTTTAA GGTTCTAAAG 
4351 GGTGTTAAAT TAGCCATGAC TCGAATTAGC AGAAAAAGGG ATGAACCAAC 
4401 TGTACACATA ATCCAAAAGC CCAGGGGTAG ACCTCAGGCA TGGCTGGATC 
4451 CAGAGGGCCA CATAATGTTA TCAGGAAATA TATTTGGCCA TTTCTCAGGT 
4501 TGGACTTCCT TTGTGTTAAT TTCATTCCCA AGCAGGCTCT CCCCAGGTGG 
4551 TGGCAAAGAT GATCGCCATT AGCTCCAGGC TTACATCCTA CCAGCTCAAC 
4601 AGGAGACTCA TTCTCAAAGT GCTAGTAAGC TGGCTTGCAT CACATGACCA 
4651 ATTACTGTGG CCAGGGGAGA GACTACTTTG ACTGGCCAGG CCTGGGTCAT 
4701 GTGACCATCT CTGGAGCCAG GGGATGGATG AGTGACTAGG GGAGGGTCAT 
4751 CCACGTCCTT GGTCCAGCAG TGGTCACAGA ACCCATAGGG AATGGAGGAG 
4801 AGGCTGGAGG GAAGCTGGGG TTCCAGTTCT TCACCTTGTG AATCCCCTCT 
4851 CCCGATAGGG GGGCCTTCTC GGAGGTGATC CTGGCAGAAG ATAAGAGGAC 
4901 GCAGAAGCTG GTGGCCATCA AATGCATTGC CAAGGAGGCC CTGGAGGGCA 
4951 AGGAAGGCAG CATGGAGAAT GAGATTGCTG TCCTGCACAA GTGCGTGGGC 
5001 CACAGCCTTT CCCTGCCCCA AGCTGACCCT GCCTTGGCCC TCCCATCCTC 
5051 CTCCTTTCCT GCTTTGGACA AATCATTTAA ACTCTCTAAG CCTTAAATTG 
5101 CCCCTTTATA AAATGGGGAT CACAATTTCC ACTTGGCAGG GTTGTGGGGA 
5151 ACATCAGAAG TCCTTTATTT CAAGTGCCTG GCCTAACATG ACAGATGTGA 
5201 TGGAGGTGCC AGTGCTTAGT CACAGGGGTT TAACTGTTCA ATCAGGTGTA 
5251 AAGATCCATC CTGAACATGG CTTGGACCCA CATATCTCAG TTGGTGTTGT 
5301 CTCTGGACCT ACCTCAAGTT CCCCTCACAT ATTAAAACCA CTCAGCAAGT 
5351 TTAAAAATGA CTGTCTGCTG ACCCCCAGAC TAAATCCACA ACCAACTGGT 
5401 CTATGAATTG CTCATGCTGA TATGAAACCT CCTGTCCTCA CTGGAAAACT
5451 TACAGAGAAT CACTTCCAAT CTCTCCCCTG AGCTTCCAAC CACCCTGGGC 
5501 AGATAATTTT TTTTTTTTTT TTGAGATGGA GTCTCACTCT GTTGCCCCGG
5551 CTGGAGTGCA GTGACGCAAT CTTGGCTCAC TGCAACCTCT GCCTCTTGGG 
5601 TTCAAGCAAT TCTCTTGCTT CAGCCTCCCT AGTAGCTGGG ATTACAGGCA 
5651 CCTGCCACCA CGCCCGGCTA ATTTTTGTAT TTTTAGTAGA GATGGGGTTT
5701 CGCCATGTTG GCCAGGCTGG TCTCGAACTC CTGACCTCAG GTGATCCACC 
5751 CGCCTCGGCC TCCCAAAGTG CTAGGCATGA GCCACCACAC CCAACTCCTG 
5801 GCAGAGCATT TCTAATAAGA CCCAGAGAGG ACAGGGATTT GTATACAGTC 
5851 ACATGGCAAG TTTGTGGCAG AGCTGAGCCT TCCTCATCAT CAAGATCAAT 
5901 TATCGCCTGA CCAACACGGA GAAACCCTGT CTCTACTAAA AATACAAAAT 
5951 TAGCCAGGCG TGGTGGCACA TGCCTGTAAT CCCAGCTACT TGGGAGGCTG 
6001 AGGCAGGAGA ATTGCTTAAA CCCGAGAGGT GGAGGTTGCG GTGAGCCGAG 
6051 ATCACACCGT GCATTACACT CCAGCCTAGG CAACAAGAGC AAAACTCCAT 
6101 CTCAAAAAAA AAAAAAAAAC AAAAAAAAAA CAAAAACGCC AGGCGCAGTG 
6151 GCTCACGCCT GTAATCCCAG CACTTTGAGA GGCTGAAGTG GGCAGATCAC 
6201 CTGAGGTGGG GAGTTCCAAA CCAGCCTGAC CAACATGGAG AAACTCCGTC 
6251 TCTACTAAAA ATACAAAATT AGCTGGACAT GGTGGCGCAT GCCTGTAATC 
6301 CCAGCTACTT GAGAGGCTGA GAAAGAAGAA TCACTTGAAC CCAGGAGGCA 
6351 GAAATTGTGA TGAGCCAAGA TCATGCCATT GCACTCCAGC CTGGGCAACA 
6401 CTCCAGCCTG AGCAACAAGA GTAAAACTCC GTCTCAAAAA AAGAAAAAAA 
6451 AAATCAATTA CCATTATTGT TTCACTTATG AGTATTTACC GTGTGCCAGG 
6501 CACTGTGCCA AGCACCTTAC CTGCATTATC TCACATGATC CTCACTCCAA 
6551 CTCTTTGAGG GAAGTACTAC CATTGGCTTC ATTTTATAGA TGAAGAAACT 
6601 GAGGTTCAGA GAGGTTACAT TAAATCTAGC ACCTACCCTG TACCAGGTGC 
6651 TGGAGGAACA GTGGCAAGCA AGACAAAGCC TCTGGATTCG GGGAGCTTAT 
6701 GTCTGGTGGG GGAGGCTGAC AAACATGTAA ACACAGAAAA CTATATATAT 
6751 ATATTTTTTT TGAGATGGAG TTTTGCTCTT GTTGCCCAGG CTGGAGTGTA
6801 ATGGCATGAT CTCGACTCAC TGCAACCTCC GTTTCCCAGG TTTAAGCAAT
6851 TCTCCTGCCT CAGCCTCACA GATAGCTGGG ATTACAGGCA TGTGCCACCA 
6901 TGCCTGGCTA ATTTTTGTAT TTTTAGTAGA GATGGGTTTT CGCCATGTTG
6951 GCCAGGCTGG TCTCGAACTC CTGACCTCAA GTGATCCGCC TGCCTTGGCC 
7001 TCCCAAAGTG CTGGGATTAC AGGTGTGAGT CTCTGTGCCT AGCCAGAAAA 
7051 CTCTTAAGAG GTATGTATCA GGCTGGGTGC AGTGGCTCAC TGGTGAAAAG 
7101 ATCTGCACCC AAATAGCATG TGACGGGCAG GATTTGGACC CAGGTCTGTG 
7151 TATGCCAGAG CCCAGTGTTT ATCCCTCTGC TCTCTCACCT TCCAAAAAAT 
7201 GGTAATAAAC CATGGTAAGC TAGCTTTTCC CTTTGGGGAC GAGATCCTTG
7251 GTTTGTCCTA CCCAGGTATG TAGGCAGTGG TCGGGGGTTG GGGGTGGCTG 
7301 AGCTGTCCTG AGCTCTAAAC CGCTGTTTTT TTTTTTTTTT TTTTGAGACA
7351 GGGTCTTACT CTGTTGCCCA GGCTGGAGTG CAGTGGCTAG TCACAGGTGC 
7401 AATCATAACA GACTGCAGCT TTGAACTGCT GGGGCCAAGT GATCCTCCTG 
7451 CCTCAGCCTC CCAAGTTCCC AAGTAGCTTG GACTACAGGT GCACACCGCC 
7501 ATGCCTGGCT AAACCACCTC ATTTCTCCTT TCAGGATCAA GCACCCCAAC 
7551 ATTGTAGCCC TGGATGACAT CTATGAGAGT GGGGGCCACC TCTACCTCAT 
7601 CATGCAGCTG TGAGTGGCCC AACCTCTGCC CTGCCCCCAC ACCTCTCCCA 
7651 GCTGTCCCAA CCCTCTTTGC CAGACTGCCC TATCCCCTGC TGCAGGGTGT 
7701 CGGGTGGGGA GCTCTTTGAC CGTATTGTGG AAAAAGGCTT CTACACGGAG 
7751 CGGGACGCCA GCCGCCTCAT CTTCCAGGTG CTGGATGCTG TGAAATACCT 
7801 GCATGACCTG GGCATTGTAC ACCGGGATCT CAAGGTGGGG CTCAAGGGGG 
7851 TGTGGTGAGC TAGGGTACCC AGGGGTGGGG CCTTTGCAAA CCCCAAACTG 
7901 TCTGACCTTG GGCAACTTTC ACCCCCTCAC TGAGCCTTGG ATTTCCATCT 
7951 ACAAAGTGGA TCTTGTAACC TTTAAACTGC CTCCTCCCAT TCTAGTCCAG 
8001 ATACTCAAAG GAACACGAGT GAATTGTGTG GCATTTTATC CAAACAACAT 
8051 TTTGTCTTTT TCTGATTAAA AAAAAAAAAA TCTGGCCAGA CAGGATGGCT 
8101 CACGCCTGTA ATCCCAGCAC TTTAGGAGGC AGAGACGGGT GGATCACCTG
8151 AGGTCAGTTC GAGACCAGCT TGGCAAAACC CTGTCTCTAC CAAAAATACA 
8201 AAAATTAGCC CGGCGTGGTG GCAGATGCCT GTAATCCCAG CTACTAGGGA 
8251 GGCTGAGGCA GGCGAATCAC TTGGACCCGG GAGGCAGAGG TTGCAGCAAG 
8301 CTGAGATTGT GCCATTGCAC GCCAGCCTGG GCGACAGAGC GAGCCTGGAC 
8351 GACAGAGCGA GACTCCATGT CAAAAAAAAT AAAATAAAAA CAAAAAATCC 
8401 TATTCCCCTT CTGTAGAAAA CTTGGATGGG ACAGCAAAAC ATAAAGAAAA 
8451 AAGCCAGAAA TCCCCGAAAT CCTACTCCTC GGAAATAGCG ACGGGGCTCA 
8501 CATTTAGCAG TACATCTCAA TCCGTTCTAG GAGAAGGGCA CTTGGGGTGT 
8551 GACATGCCTG GTTTTGAATT CTGGCTCTGC TACTGCCTAA CTGTGGGTTC 
8601 TTGGGTGAGT CACTTTGCCT CCAAAGGCAT CAGTTTCCTC ATCTGTTAGG 
8651 TGAGATTATA CAGACTGGCC TAGCAGGGAA GCAGTGAGGA TGGCATTAAA 
8701 TCAAGCACTA ATCCAGGGTC TGGCATAAAA TAGGCATTCA AACATTCCTT 
8751 TAGGGCTTTA CAGTGCACAC CTGAGGTTTA GAGACAGTTC CCCCCCACAC 
8801 CCTCTTGAGC CTTGTCCTTC CTGGAATTTT TGGCCTTCTT GAGAGCTTCC 
8851 TTGATTTTCT TATGACAGCC ATGAAGCCAC AGTGGCTTTT GGGGATCCAT 
8901 TATTTCTCAG AAGGTGCTTG GAGCGGCAGA AGGTTCTACC AGCCTCTAAC 
8951 CATCTCTGAT TGCCCCTTCT CTTCCCTCCT GCCCTTCAAG CCAGAGAATC 
9001 TGCTGTACTA CAGCCTGGAT GAAGACTCCA AAATCATGAT CTCCGACTTT 
9051 GGCCTCTCCA AGATGGAGGA CCCGGGCAGT GTGCTCTCCA CCGCCTGTGG 
9101 AACTCCGGGA TACGTGGGTG CGGAGGGCCC TGGGCTGGGG CTGTGATGGT 
9151 GGGGGGAACC AGGAGTTGAA GGGCAGAGAT TTGTCACCAC CACGTCCTCT 
9201 TCCCTCCACA GCCCCTGAAG TCCTGGCCCA GAAGCCCTAC AGCAAGGCTG 
9251 TGGATTGCTG GTCCATAGGT GTCATCGCCT ACATCTTGTA AGTGGGGCTT 
9301 GGCCATGGTA GGCTGTGGCT CCAGAGTTGT CCTCTCGCCT ACTTTCCTCT 
9351 CTTCCTTCCT CTGCTCTCCC TCTGCCCTCC CTTCCTTCCC TCCCTCCCTT 
9401 CCTTCCACCA ATCAATTACC AGTATTACTT CATTCAATAG ATACTATGTT 
9451 TCAAGCACTG TGCCAAGCAA GCACTGGGGT AAATTTAGCA CAGCACAAAC 
9501 CAGACAAAGT GCCTGCCCTC AGGGAGCTGA CTTTCTTTCT AGTAGGGAAG 
9551 ACAGACAATC AACAAGTAAA TAAATCTACA AACTGACGTC AGGTGATAAA
9601 AATAAATACT GTGGAGAAAA ACCAAGCAGG AATAGGGAGA CGGGGTGATG
9651 CCATTTCAGT AGGGAGGTCA GGGAAGGGCT CGCTGTGGAG GTGATGACCG
9701 AGTGGTGAGG GAGCCAGACA TTGGAGGTGT GGGGAAAGAG TGGCATAGGC
9751 AGAAGCAATG GCAAGTGCAA AGGCCCTGAG GAGGGCAAGA TGGCGGCACA
9801 TACAAGGAAC AGAAAGGATA ATGTAGCTAG AACAGGAGTG AGCAGGCAGG
9851 GCTGGTAGAG TTTATAAAGG GGGAACTCCT TCCATGGCTC CTGCCTGACC
9901 CCTGAGACTG CCCCAGTGCT CCACCCCGGA GCCAACGGCA CCCGAAAGTG
9951 GAAATGAGGA TGAGTTTCTC CCTGCCCAGG CTCTGCGGTT ACCCTCCCTT 
10001 CTATGACGAG AATGATGCCA AACTCTTTGA ACAGATTTTG AAGGCCGAGT 
10051 ACGAGTTTGA CTCTCCTTAC TGGGACGACA TCTCTGACTC TGGTATTTGG 
10101 GGCTTTGCTT TTTTCCCCTG GGCCCTGCCT CTGGTTCCTC CCTCACCTGC 
10151 TTTGGGGGCG GTCTCCCTCC TGCCTTCCTT CTGTCGGATT TTCCAGCACC 
10201 ACACAAAGAG CTGTCTTCGA GACCAGACAC CCTACCCCTT CTTCCTTCTG 
10251 CTTGGGTACT TCCTTCTGCT TGGCTCCCAG AGTGAGAAAC TAGGCATTCA 
10301 TTTGTTCAAT CTTCAAACAT AGTCTATTTG AAAATACCTC TCCCCTATTG 
10351 ACACCCTAAT GTCTAAAACA CCACCATAAA CATTTTCATC CTCCTTTTGT 
10401 GCCCCCTATT AAGAAGCAAA CCTGTGAAGC TACTATCGTT TATCATCAGT 
10451 GTGAATGCAC TGAGATTAGT CAAGAACAAC TTTTTTTTTT TTTTCTTTCT
10501 TTTTTGAGAC GCAGTCTCGC TCTGTTGCCC AGGCTGGAGT GCAGTGGCAC
10551 AATCTCGGCT CACTGCAACC TCTGTCTCCC GGGTTCGAGC AATTCTCTGC 
10601 CTCAGCCTCC CAAGTAGCTG GGATTACAGG CGCCCACCAC CATGCCCGGC 
10651 TAATTTTTTT TGTATTTTTA GTAGAAACAA GGTTTCACCA TCTTGGCCAG
10701 GCTGGTCTTG AACTCCTGAC CTCGTGATCC ACCTGCATTG GCCTCCCAAA 
10751 GTGCTGGGAT TACAGACATG AGCCACTGTG CCCGGCCATA TGI I I I IU I 
10801 AAGAGAGAAA GGAAAGAGGT GGAAGGCACG GGGTGGGAGG GCCTGAAGAA
10851 GAGCATAGGT TGGGTGGGGT GGGGCATGGA CTGATTTGGC CTCTTTGTCT
10901 TGATGCCAGG CCAGACCTGA GGGAGTGGGT ATGCTCTTGG GGAGTACACA
10951 GGCAGTACCA TGCTGTCATT ATCTTTGCTT TTGTCTTGGG GGTTTAGCCA 
11001 AAGATTTCAT CCGGCACTTG ATGGAGAAGG ACCCAGAGAA AAGATTCACC 
11051 TGTGAGCAGG CCTTGCAGCA CCCATGGTGA GAATTCACAC AACCTGTGAG 
11101 CTGGGGCGGG ATTTGGGGCC CTCAGGTCTG CTTCTGCCCT CATAGGCAAC 
11151 CCACCACATA ACCCCATCCT AGGATTGCAG GAGATACAGC TCTAGATAAG 
11201 AATATCCACC AGTCGGTGAG TGAGCAGATC AAGAAGAACT TTGCCAAGAG 
11251 CAAGTGGAAG GTGAGTCCAT ATCCCTAGTT CTGGTCCCAG CCTCCCCAGG 
11301 ACTCCTCCCC ATCCCTACCC AGGCTCAGCT TGCACAGCAC CTGGCATCAC 
11351 ACTGGGCACA CAGTAACTGC TTAGGGATCC TTACTGAAGG ACTTCATTCA 
11401 TTCACTCTTT CATTCAACAA ACACTCCCAA CACCTTCTCT ATTCCAGAGA 
11451 GGGTCCCTCA CCTCCAAGTC TAGAGGAAGA AGTCTGTAAT TCTTCAGGAG 
11501 GCATCTGATC CAGCCTATGG GGTCCGAGAA AGGTCATAAA AGTGGTGATG 
11551 ACCTGACAGA GCTGTCAGTT AAGTAGGAAT TAGTGAGGCA TAGCGGAATA 
11601 ATGTCTATAG CCATTCCGGG AAGTGCAAGT GCTAAGCCTG GCCAGACTGG 
11651 AGGGGCTGAG GGGACTGAGA GGCAGGAGCC CAATTTAGAG AAGCAGGTAA 
11701 GGGGCCAGGC CTCTTAGGGC CTCATATGCC ACAGAGGAGC ACCAACTTGA 
11751 TCCTGAGGGC ACTGAGGAGC CCCAGAAGAA TCTTAGGCAA GTATTTGCTG 
11801 CATAGAAAGG GCTCTCAGGG CCAGGCATGG TGGCTCACGC CTGTAATCCC 
11851 AGCACTTTGG GAGGCCGAGG TGGTTGGATC ACCTGAGGTC AGGAGTTCAA 
11901 GACCATCCTG GCCAACATGG CAAAACCCTG TCTCTACTAA AAATAAAAGA 
11951 ATTAGCCACA CATGGTGGTG CGTGCCTGTA ATCCCAGCTA CTTGGGAAGC 
12001 TGAGGCAGGA GAATTACTTG AACCTCGGAG ATGGGGGTTG CAGTGAGCTG 
12051 AGATCGCGCC ACTGCACTCC AGCCTGGGCA ACAAAGTGAG ACTCCACCTC 
12101 AAAAAAAAAA GAAAGAGCTC TCAGGATGCA GAGAATGGCA TGGAGTAAAG 
12151 ACTGGGTGAC GCATTAGGAG GCTGTGGCAG AGATACAGGC AGGAGATGGT 
12201 AAGGGTTTGG AACCACAGTA GCAGCAACAG GGGGCAGAGA ACAGTGGTTG 
12251 ATCCAGGAGT CATTTAGGAG GTGAAACTGA CAAGACATGA CGATGCAATG 
12301 GATGTTGGGG GAAAGAGATG TCAAGGGCTG GCCCAAGACT GTGGCTGGGA 
12351 ACAGAATGGA TGGTGGTGGT ACCATGACTG AGATGGTTAT CACAGGGACA 
12401 GAAACATGTT TTGGGGGGAT GGTTTTAGTT TTAGACATGG TGAATTTGAG 
12451 GGGTGTGTGG GACACCTAGG TGGAGATATT GAATAGAGAC ACACCTGAGC 
12501 AAGTTACTTC AGCTTTCTGT GCCTCAGTTT CCTCCTTTGA AAATGATAAT 
12551 AGTACCTACC TCAAAGACTT TCATGAAGAT TAAATGAATT ACTACGTAAA 
12601 GTGCTTAGAA CAGTGCCTGA CATACAGTGC TATAGTGTTT GCTATTACAT 
12651 ATTAATATGA ATTATAGTTA TGTTTCTATT TATATATATA GATACACATA 
12701 CATCTAACAT ATGTGCGTGT GTGTGTGTAA ATATATAATA AAGCCTTGTA 
12751 GAGGTTTTTG GGGGGCTTTA GGGGAATTAA TAAAATAACT CCTGAATGAA 
12801 AATAACAGAA CAATTGCAAG AATCCCACTG CGCCCCTGCC CCATGACTTG 
12851 ACTCTCTCAA AAGTCCTTTC TCCCCTCTCC CTTCAATGCC TTCAATGCCA 
12901 GCAAGCCTTC AATGCCACGG CTGTGGTGCG GCACATGAGG AAACTGCAGC 
12951 TGGGCACCAG CCAGGAGGGG CAGGGGCAGA CGGCGAGCCA TGGGGAGCTG 
13001 CTGACACCAG TGGCTGGGGG TGAGGAGCGG GCTCTGCAGA AGGGCATGGG 
13051 TGGTCCACAA AGGTGCACCC GGGCTGGAGT GGAGGGCCTG CCCCTGCGGC 
13101 CACCTCTGTT CTGTCTTCCC ATGCAGGGCC GGCAGCTGGC TGTTGCTGTC 
13151 GAGACTGCTG CGTGGAGCCG GGCACAGAAC TGTCCCCCAC ACTGCCCCAC 
13201 CAGCTCTAGG GCCCTGGACC TCGGGTCATG ATCCTCTGCG TGGGAGGGCT 
13251 TGGGGGCAGC CTGCTCCCCT TCCCTCCCTG AACCGGGAGT TTCTCTGCCC 
13301 TGTCCCCTCC TCACCTGCTT CCCTACCACT CCTCACTGCA TTTTCCATAC 
13351 AAATGTTTCT ATTTTATTGT TCCTTCTTGT AATAAAGGGA AGATAAAACC 
13401 ATCCTTAGCG CTGTCTCCCT CAATATCCCC CACCCCATCT TGTTGTGCAA 
13451 ACTGACTGCT TGATTTGGGG GTGCCTGGCC TTTGAGGTAG TCACAGGGAG 
13501 GCCCCTCCCC AACATGAGAC TGGGTGGGGA TGGGGAGAGA GAAGTGGGGA
13551 ATGGAGGGGA AGGTGCTTGG GGAATTTCTT TGTCCAGGGT GCCCCATCTA 
13601 GCCTTCCGGC CCTTTGGAAC CCTTTCTGCG CTTTGCTGGT GGCTCCTGAG
13651 CATGGCGGGA TTGGCGCAGG TCGGCACTGA ACAGCACCTG TAGGAGGGTG
13701 GAGTCTGTGT GGGGAGGAGG GTACACTGGG GTCAGGGCTG GTGAGACTAG 
13751 TGACAGTGTT GGGAGGTGGA AGAGTCCTTG GGGAACAGGG CCGAAGGCAA 
13801 TGAGAATCCA CTGGGGTTGG GACAGGGGTG GCTGGAGAGT CCTTTAGGGC 
13851 CACCTGGGGC GGTGGTGGAA GAGTCCACTG GGTCTGGGCT GGAGGAGAGG 
13901 AAACCTAGGG AGGACACCTA GGTACACTCA CCGCTTGGGC CCAGCCAGCA 
13951 TAAGGTCCCC ACAGGCTCCG GAAAAAGTTT CCTAAATCAG AAGTGATGAG 
14001 ACTAAGTTAT CTGACCCCTT CTGTGACCCA TCAACAGAAG TAGGGTCTGA 
14051 GGGAGAGGTG ACTAAGAGAG AGAGAAGTTT CTACCATCCC AGCCCACTGC 
14101 CAGCCCCTGC AGCCCACTTT CCTCACCCAG TTCCTTGTTG GTCTGGGGGC 
14151 TCGGTCCCTT CGCCTGGGAC GTGGTAGGGT GCCAGCTGTA GTCACGTTGG 
14201 GCAATGTGCC ACATATGGAC ATCCACGGGC ACAGCCTGGG GCTTGTCTAG 
14251 GGCCATCAGG CAGATGCAGT CAGCCACCTT TGACAGACAC AGAATGAGCC 
14301 CTTGTGGAAG AAGGGCAGCA TGTGGCCAGC ATCTTGCTTA TAGCCCCAAA 
14351 GCCGGCTGCT TTCTCCTTCA CTCTGGGGTT ACTGTTGTTC TATATTCTCA
14401 ATCAACAGAT ACTATCTATG AATACACTTT TTTTTTGTTT GTTTTTGAGA
14451 TGGAGTCTCG CTCTGTTGCC TAGGCTGGAG TGCACTGGTG CAATCCTGGC
14501 TCTCCCAGGT TCAAGCAATT CTCCTACCTC AGCCTCCCAA GTAGCTGGGA
14551 TTACAGGCAT GTGCCACCAC GTGTGGCTAA TTTTTGTGTT TTTAGTAGAG
14601 ATGGGGTTTC ACCATGTTGG CCAGCCTGGT CTCGAACTCC TGACCTCAAG
14651 TGATCTGTCC ACCTTGGCCT CCCAAAGTGC TGGGATTACA GGCGTGAGCC
14701 ACCATGCGCG GCCTATGAAT ACACTGAAAT TGCTGTAATA AGAGGTGCTA
14751 CTAGCTGAAC ACCTATGTGG GCCAGGTTAT CATAACCTGG GAAGAAGGTA
14801 TTACCACACC CACTTTACAG ACAAGAAAAC TGAGGCTTTG AAAGGTGAAG
14851 TGACCTGGCC AAAGTCACAT GGCTGAGAAT AGGCAGAACC AAGATTTAAT
14901 GTTAGGCTGT AGTCCAAAGC CCATCAAAAA AAAATCTTTA AGCAAAAATT
14951 CATTTTTTAA ACTACAGAGA AGTATAAAGA AAAAAAAAGG CTGGGTGCAG
15001 TGGCTCACGC CTGTAATCCC AGCACTTTGG GAGGCTGAGG CAGGTGGATC
15051 TCGAGGTCAG ATTGAGACCA TCCTGGCCCA ACATGGTGAA ACCCCATCTC
15101 TACTAAAAAT ACAAAAATTA GCTGGGTGTG GTGGCGCATG CCTGTAATCC
15151 CAGCTAATCT GGAGGCTGAG GCAGGAGAAT AGCTTGAAGC CGGGAGGCGG
15201 AGGTTGCAGT GAGCCGAGAT TGCACCACTG CACTCCAGCC TGGCAACAAA
15251 GCAAGACTCC ACCTCAAAAA AAAAAAAAAA AAGACAAATG CCTAATTTCC
15301 AGTCATCTTA TTGCCAGTTA ACCCTATTGA CATCAAGCAA AAAGTTTTGT
15351 CAGTACATGT CATTTTACGA AAGGAACAAA ATGTGGCCGG GAGCAGTGGC (SEQ ID NO: 3)



FEATURES: 

Genewise results: 

Start: 3000 

Exon: 3000-3082 

Exon: 7535-7609 

Exon: 7696-7834 

Exon: 8991-9117 

Exon: 9212-9287 

Exon: 9980-10092 

Exon: 10998-11076 

Exon: 11173-11260 

Exon: 12902-13019 

Exon: 13127-13206 

Stop: 13207 

Sim4 results:

Exon: 3000-3082,   (Transcript Position: 1-83)
Exon: 7535-7609,   (Transcript Position: 84-158)
Exon: 7696-7834,   (Transcript Position: 159-297)
Exon: 8991-9117,   (Transcript Position: 298-424)
Exon: 9212-9287,   (Transcript Position: 425-500)
Exon: 9980-10092,  (Transcript Position: 501-613)
Exon: 10998-11076  (Transcript Position: 614-692)
Exon: 11173-11260  (Transcript Position: 693-780)
Exon: 12902-13019  (Transcript Position: 781-898)
Exon: 13127-13209  (Transcript Position: 899-981)
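
The transcript positions listed for each exon follow directly from the cumulative exon lengths. The sketch below recomputes them from the Sim4 genomic coordinates and maps a genomic position onto the transcript; the function names `transcript_spans` and `genomic_to_transcript` are illustrative only.

```python
# Sim4 exon coordinates on SEQ ID NO:3 (1-based, inclusive), from the listing.
EXONS = [
    (3000, 3082), (7535, 7609), (7696, 7834), (8991, 9117), (9212, 9287),
    (9980, 10092), (10998, 11076), (11173, 11260), (12902, 13019),
    (13127, 13209),
]


def transcript_spans(exons):
    """Return the (start, end) transcript span covered by each exon."""
    spans, next_start = [], 1
    for start, end in exons:
        length = end - start + 1
        spans.append((next_start, next_start + length - 1))
        next_start += length
    return spans


def genomic_to_transcript(position, exons):
    """Map a genomic position to a transcript position, or None if intronic."""
    offset = 0
    for start, end in exons:
        if start <= position <= end:
            return offset + (position - start) + 1
        offset += end - start + 1
    return None


for exon, span in zip(EXONS, transcript_spans(EXONS)):
    print(f"Exon: {exon[0]}-{exon[1]}  (Transcript Position: {span[0]}-{span[1]})")

# The Genewise stop at genomic position 13207 maps to the stop codon at 979.
print(genomic_to_transcript(13207, EXONS))
```

The spliced length comes to 981 bases, matching the length of SEQ ID NO:1, and the Genewise stop position maps onto the stop codon reported for the transcript.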



CHROMOSOME MAP POSITION: 

chromosome 3 



ALLELIC VARIANTS (SNPs):

DNA
Position   Major   Minor    Domain
1892       A       G        Intron
3351       G       A        Intron
8636       T       A        Intron
8805       T       C        Intron
9802       T       A        Intron
9833       C       G        Intron
11352      C       T        Intron
13319      T       C A G    Intron
13659      C       G        Intron
14292      G       C        Intron


Context: 

DNA 

Position 

1892    GGTGTTCGGGTGCAGGGCTGTCTGGGGCACTGTGTGGTGTGGACATGTGTGCTGATGTCT
        GGGAGTACATGTATGATCAGGTGTCACGGGATGTGGATACAAGGCGTACTGGATCTGGGA
        GGCAGGTGTTTGAGTTCAGGGCTGTGGAGGGGGCTTGGTGTGGCATGTCTGCTACAGGGA
        TGTGTGTGGATCTGTGAGGGTTGTATTTGGTAGGCCTCCATGTGGGTTTCAGACTCTGCC
        TCTAGAGCTTACACTCGAGTCTCCTTTCCTAGAAGATTCTGCCCCTGGATGGGTGGGCAG
        [A,G]
        GTCCCCTGGGAAAAAGGTCCTGTTCCAGGAGTGGAATCTCACACCAGAGGCCCTAGTCAG
        GGCACCTTCTCCTCATTCTCCCTTAGAGAAAAAGAGAGAAGGAAAGTGCTCTCCCTGAGG
        TCACAAAGCATGCTGGGCTCTGTTTTGGCCTCATCTGTGGATGGGTTGGGAGGCTGTGTT
        CTCTGAATGGGGCCCATTCTGGCTTCATATTGGAAGTACCAGCCAAGGCCATTCGATGGC
        CTTTGCCCTCAGCAAGCTTAGCTGGGGGCCCCAGGCCAGGTGTCATTAGGGCCTCTGGAG

3351    GACATCTACGACTTCCGAGATGTTCTGGGCACGTGAGTCCAGGGCAGGATTGGGTGCTGG
        ATGGCTGAGGGAGGCTGAGTCCAGGGTGGGGCTTCCTCTGGTCAATTAATGCTTCCTGTT
        TCCCACAGCCCAGGCCCTGTGGCAGCACTATCTAGGGCCTAAACTGTCCCCAGCTTTTCA
        CTTCTGGATGACAGTGGGTGGGACACGGGCTGCTCTCCCAATAGCCCTGGGTTCTTGAAG
        AGAAAGAAGTCGAGAGAATGAAGGTGCCAGTCAGTCCATTTAACTTGCTGCCAAGAGCTA
        [G,A]
        GTGTTCTAGCCTAGGTTTGGGAACTGAGGCTGGAGATGGCTCTGTTCTTGGTGCTGGGAA
        TGCAGAAATAACTCAAACCTGGTCTCTGCCCTTCAAGTTGATCCCAGACATGTGCAAGAG
        ACAGACCTACAGAAAATGACAACAGGGTGTGTGCTGTGCTCCAATTAAGGTTGGGATTGA
        GGGCTTTGTGGAGCCCAGAGAGAGCTGTGCCTTCTGCCTGGGGGAAAACTTCCTGGAGAA
        TGGGGCATTAGAGCTGGGGACTGAAGGATGGGTAGGTGTGCACTTGTCAGAGAGGAAGAA

8636    AGAGCGAGCCTGGACGACAGAGCGAGACTCCATGTCAAAAAAAATAAAATAAAAACAAAA
        AATCCTATTCCCCTTCTGTAGAAAACTTGGATGGGACAGCAAAACATAAAGAAAAAAGCC
        AGAAATCCCCGAAATCCTACTCCTCGGAAATAGCGACGGGGCTCACATTTAGCAGTACAT
        CTCAATCCGTTCTAGGAGAAGGGCACTTGGGGTGTGACATGCCTGGTTTTGAATTCTGGC
        TCTGCTACTGCCTAACTGTGGGTTCTTGGGTGAGTCACTTTGCCTCCAAAGGCATCAGTT
        [T,A]
        CCTCATCTGTTAGGTGAGATTATACAGACTGGCCTAGCAGGGAAGCAGTGAGGATGGCAT
        TAAATCAAGCACTAATCCAGGGTCTGGCATAAAATAGGCATTCAAACATTCCTTTAGGGC
        TTTACAGTGCACACCTGAGGTTTAGAGACAGTTCCCCCCCACACCCTCTTGAGCCTTGTC
        CTTCCTGGAATTTTTGGCCTTCTTGAGAGCTTCCTTGATTTTCTTATGACAGCCATGAAG
        CCACAGTGGCTTTTGGGGATCCATTATTTCTCAGAAGGTGCTTGGAGCGGCAGAAGGTTC

8805    TAGCAGTACATCTCAATCCGTTCTAGGAGAAGGGCACTTGGGGTGTGACATGCCTGGTTT
        TGAATTCTGGCTCTGCTACTGCCTAACTGTGGGTTCTTGGGTGAGTCACTTTGCCTCCAA
        AGGCATCAGTTTCCTCATCTGTTAGGTGAGATTATACAGACTGGCCTAGCAGGGAAGCAG
        TGAGGATGGCATTAAATCAAGCACTAATCCAGGGTCTGGCATAAAATAGGCATTCAAACA
        TTCCTTTAGGGCTTTACAGTGCACACCTGAGGTTTAGAGACAGTTCCCCCCCACACCCTC
        [T,C]
        TGAGCCTTGTCCTTCCTGGAATTTTTGGCCTTCTTGAGAGCTTCCTTGATTTTCTTATGA
        CAGCCATGAAGCCACAGTGGCTTTTGGGGATCCATTATTTCTCAGAAGGTGCTTGGAGCG
        GCAGAAGGTTCTACCAGCCTCTAACCATCTCTGATTGCCCCTTCTCTTCCCTCCTGCCCT
        TCAAGCCAGAGAATCTGCTGTACTACAGCCTGGATGAAGACTCCAAAATCATGATCTCCG
        ACTTTGGCCTCTCCAAGATGGAGGACCCGGGCAGTGTGCTCTCCACCGCCTGTGGAACTC

9802    AGACAAAGTGCCTGCCCTCAGGGAGCTGACTTTCTTTCTAGTAGGGAAGACAGACAATCA
        ACAAGTAAATAAATCTACAAACTGACGTCAGGTGATAAAAATAAATACTGTGGAGAAAAA
        CCAAGCAGGAATAGGGAGACGGGGTGATGCCATTTCAGTAGGGAGGTCAGGGAAGGGCTC
        GCTGTGGAGGTGATGACCGAGTGGTGAGGGAGCCAGACATTGGAGGTGTGGGGAAAGAGT
        GGCATAGGCAGAAGCAATGGCAAGTGCAAAGGCCCTGAGGAGGGCAAGATGGCGGCACAT
        [T,A]
        CAAGGAACAGAAAGGATAATGTAGCTAGAACAGGAGTGAGCAGGCAGGGCTGGTAGAGTT
        TATAAAGGGGGAACTCCTTCCATGGCTCCTGCCTGACCCCTGAGACTGCCCCAGTGCTCC
        ACCCCGGAGCCAACGGCACCCGAAAGTGGAAATGAGGATGAGTTTCTCCCTGCCCAGGCT
        CTGCGGTTACCCTCCCTTCTATGACGAGAATGATGCCAAACTCTTTGAACAGATTTTGAA
        GGCCGAGTACGAGTTTGACTCTCCTTACTGGGACGACATCTCTGACTCTGGTATTTGGGG

9833    TTCTTTCTAGTAGGGAAGACAGACAATCAACAAGTAAATAAATCTACAAACTGACGTCAG
        GTGATAAAAATAAATACTGTGGAGAAAAACCAAGCAGGAATAGGGAGACGGGGTGATGCC
        ATTTCAGTAGGGAGGTCAGGGAAGGGCTCGCTGTGGAGGTGATGACCGAGTGGTGAGGGA
        GCCAGACATTGGAGGTGTGGGGAAAGAGTGGCATAGGCAGAAGCAATGGCAAGTGCAAAG
        GCCCTGAGGAGGGCAAGATGGCGGCACATACAAGGAACAGAAAGGATAATGTAGCTAGAA
        [C,G]
        AGGAGTGAGCAGGCAGGGCTGGTAGAGTTTATAAAGGGGGAACTCCTTCCATGGCTCCTG
        CCTGACCCCTGAGACTGCCCCAGTGCTCCACCCCGGAGCCAACGGCACCCGAAAGTGGAA
        ATGAGGATGAGTTTCTCCCTGCCCAGGCTCTGCGGTTACCCTCCCTTCTATGACGAGAAT
        GATGCCAAACTCTTTGAACAGATTTTGAAGGCCGAGTACGAGTTTGACTCTCCTTACTGG
        GACGACATCTCTGACTCTGGTATTTGGGGCTTTGCTTTTTTCCCCTGGGCCCTGCCTCTG

11352   GTGAGCAGGCCTTGCAGCACCCATGGTGAGAATTCACACAACCTGTGAGCTGGGGCGGGA
        TTTGGGGCCCTCAGGTCTGCTTCTGCCCTCATAGGCAACCCACCACATAACCCCATCCTA
        GGATTGCAGGAGATACAGCTCTAGATAAGAATATCCACCAGTCGGTGAGTGAGCAGATCA
        AGAAGAACTTTGCCAAGAGCAAGTGGAAGGTGAGTCCATATCCCTAGTTCTGGTCCCAGC
        CTCCCCAGGACTCCTCCCCATCCCTACCCAGGCTCAGCTTGCACAGCACCTGGCATCACA
        [C,T]
        TGGGCACACAGTAACTGCTTAGGGATCCTTACTGAAGGACTTCATTCATTCACTCTTTCA
        TTCAACAAACACTCCCAACACCTTCTCTATTCCAGAGAGGGTCCCTCACCTCCAAGTCTA
        GAGGAAGAAGTCTGTAATTCTTCAGGAGGCATCTGATCCAGCCTATGGGGTCCGAGAAAG
        GTCATAAAAGTGGTGATGACCTGACAGAGCTGTCAGTTAAGTAGGAATTAGTGAGGCATA
        GCGGAATAATGTCTATAGCCATTCCGGGAAGTGCAAGTGCTAAGCCTGGCCAGACTGGAG

13319   GGTGAGGAGCGGGCTCTGCAGAAGGGCATGGGTGGTCCACAAAGGTGCACCCGGGCTGGA
        GTGGAGGGCCTGCCCCTGCGGCCACCTCTGTTCTGTCTTCCCATGCAGGGCCGGCAGCTG
        GCTGTTGCTGTCGAGACTGCTGCGTGGAGCCGGGCACAGAACTGTCCCCCACACTGCCCC
        ACCAGCTCTAGGGCCCTGGACCTCGGGTCATGATCCTCTGCGTGGGAGGGCTTGGGGGCA
        GCCTGCTCCCCTTCCCTCCCTGAACCGGGAGTTTCTCTGCCCTGTCCCCTCCTCACCTGC
        [T,C,A,G]
        TCCCTACCACTCCTCACTGCATTTTCCATACAAATGTTTCTATTTTATTGTTCCTTCTTG
        TAATAAAGGGAAGATAAAACCATCCTTAGCGCTGTCTCCCTCAATATCCCCCACCCCATC
        TTGTTGTGCAAACTGACTGCTTGATTTGGGGGTGCCTGGCCTTTGAGGTAGTCACAGGGA
        GGCCCCTCCCCAACATGAGACTGGGTGGGGATGGGGAGAGAGAAGTGGGGAATGGAGGGG
        AAGGTGCTTGGGGAATTTCTTTGTCCAGGGTGCCCCATCTAGCCTTCCGGCCCTTTGGAA

13659   CTATTTTATTGTTCCTTCTTGTAATAAAGGGAAGATAAAACCATCCTTAGCGCTGTCTCC
        CTCAATATCCCCCACCCCATCTTGTTGTGCAAACTGACTGCTTGATTTGGGGGTGCCTGG
        CCTTTGAGGTAGTCACAGGGAGGCCCCTCCCCAACATGAGACTGGGTGGGGATGGGGAGA
        GAGAAGTGGGGAATGGAGGGGAAGGTGCTTGGGGAATTTCTTTGTCCAGGGTGCCCCATC
        TAGCCTTCCGGCCCTTTGGAACCCTTTCTGCGCTTTGCTGGTGGCTCCTGAGCATGGCGG
        [C,G]
        ATTGGCGCAGGTCGGCACTGAACAGCACCTGTAGGAGGGTGGAGTCTGTGTGGGGAGGAG
        GGTACACTGGGGTCAGGGCTGGTGAGACTAGTGACAGTGTTGGGAGGTGGAAGAGTCCTT
        GGGGAACAGGGCCGAAGGCAATGAGAATCCACTGGGGTTGGGACAGGGGTGGCTGGAGAG
        TCCTTTAGGGCCACCTGGGGCGGTGGTGGAAGAGTCCACTGGGTCTGGGCTGGAGGAGAG
        GAAACCTAGGGAGGACACCTAGGTACACTCACCGCTTGGGCCCAGCCAGCATAAGGTCCC

14292   AGTGATGAGACTAAGTTATCTGACCCCTTCTGTGACCCATCAACAGAAGTAGGGTCTGAG
        GGAGAGGTGACTAAGAGAGAGAGAAGTTTCTACCATCCCAGCCCACTGCCAGCCCCTGCA
        GCCCACTTTCCTCACCCAGTTCCTTGTTGGTCTGGGGGCTCGGTCCCTTCGCCTGGGACG
        TGGTAGGGTGCCAGCTGTAGTCACGTTGGGCAATGTGCCACATATGGACATCCACGGGCA
        CAGCCTGGGGCTTGTCTAGGGCCATCAGGCAGATGCAGTCAGCCACCTTTGACAGACACA
        [G,C]
        AATGAGCCCTTGTGGAAGAAGGGCAGCATGTGGCCAGCATCTTGCTTATAGCCCCAAAGC
        CGGCTGCTTTCTCCTTCACTCTGGGGTTACTGTTGTTCTATATTCTCAATCAACAGATAC
        TATCTATGAATACACTTTTTTTTTGTTTGTTTTTGAGATGGAGTCTCGCTCTGTTGCCTA
        GGCTGGAGTGCACTGGTGCAATCCTGGCTCTCCCAGGTTCAAGCAATTCTCCTACCTCAG
        CCTCCCAAGTAGCTGGGATTACAGGCATGTGCCACCACGTGTGGCTAATTTTTGTGTTTT
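
Each context entry above appears to consist of the genomic bases flanking the variant position on either side, with the alleles given in brackets in place of the variant base. A minimal sketch of that extraction follows; the function name `snp_context`, the `flank` parameter and the `first_position` offset are assumptions made for illustration, and the demonstration uses a short 70-base fragment of SEQ ID NO:3 (positions 3301-3370) with a 10-base flank rather than the longer flanks shown in the listing.

```python
def snp_context(sequence, position, alleles, flank, first_position=1):
    """Return upstream flank, bracketed alleles and downstream flank.

    `position` is 1-based in genomic coordinates; `first_position` is the
    genomic coordinate of sequence[0], so a sub-fragment can be used.
    """
    index = position - first_position
    if not 0 <= index < len(sequence):
        raise ValueError("position lies outside the supplied sequence")
    upstream = sequence[max(0, index - flank):index]
    downstream = sequence[index + 1:index + 1 + flank]
    return f"{upstream}[{','.join(alleles)}]{downstream}"


# Positions 3301-3370 of SEQ ID NO:3, copied from the listing.
fragment = (
    "CGAGAGAATGAAGGTGCCAGTCAGTCCATTTAACTTGCTGCCAAGAGCTA"
    "AGTGTTCTAGCCTAGGTTTG"
)
print(snp_context(fragment, 3351, ["G", "A"], flank=10, first_position=3301))
```

With the full genomic sequence and a larger flank, the same call would produce entries in the layout used above.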

FIGURE 31



